In this article, we introduce a new method for solving the traveling salesman problem (TSP) that
combines the whale optimization algorithm (WOA) with K-means, a partitioning-based clustering
algorithm. The WOA was first introduced in 2016 and was later applied to the TSP. In the TSP,
finding the best path, that is, the path with the lowest value of the fitness function, has
always been difficult and time-consuming. Our algorithm searches for the best tour by dividing
the problem into smaller parts called clusters, solving each cluster with the WOA, and then
joining the clusters based on their distances. To do this, the WOA, the TSP, and K-means must be
combined. Separately, the unclustered WOA-TSP algorithm is also implemented for comparison with
the proposed algorithm. The results, presented in figures and tables, show the effectiveness of
the new method.
1. INTRODUCTION

Solving the traveling salesman problem (TSP) has long been a topic of interest to researchers,
for example in connection with the vehicle routing problem, which was first introduced by
Dantzig et al. [1], [2]. The problem of determining a travel route is similar to finding a
solution to the TSP [3]. Since the TSP is NP-hard, the time needed to search for an optimal tour
among the cities grows exponentially with the number of cities, so exact methods do not have a
suitable execution time [4], whereas an intelligent method requires less computation time and can
offer better accuracy [5]. Metaheuristic algorithms are powerful methods for solving many tough
optimization problems [6]. Over the past three decades, meta-heuristics have had a potential
impact on tasks of managing operational problems (e.g., assignment and scheduling) [7].

Swarm intelligence is an optimization technique that explores the search domain by evolving a
population of individuals. These individuals represent candidate solutions to the problem [8].
The individuals move toward better solution areas. In each iteration the individuals cooperate
until they reach the best solution [8]-[10]. Swarm-based algorithms have proven effective in
solving nonlinear optimization problems in large search spaces [11]-[13]. Over the last decades,
nature has been a source of inspiration for several meta-heuristics that were introduced to solve
optimization problems. These meta-heuristics have been tested on discrete problems [14]. Some of
the nature-inspired algorithms are: particle swarm optimization (PSO) [15], [16], ant colony
optimization (ACO) [17]-[19], artificial bee colony (ABC) [20], cuckoo search (CS) algorithm
[21], [22], krill herd (KH) [23], and spotted hyena optimizer (SHO) [24].
Exploration and exploitation are the two basic factors that control a metaheuristic algorithm.
Exploration refers to the capacity of a metaheuristic algorithm to search diverse regions of the
search space to find promising solutions [25]-[27]. Exploitation, on the other hand, is the
ability to concentrate the search within a promising region to extract the optimal solution. A
metaheuristic algorithm balances these two conflicting objectives, so in any metaheuristic
algorithm, or an improved version of one, performance is improved by controlling these two
factors [27].

In [28], a new clustering-based technique built on K-means was introduced to solve the TSP with
the firefly algorithm; we extend that work by using the WOA instead. With a view to minimizing
the cost function and hence improving efficiency on the TSP, the WOA is combined with K-means.
The WOA is also applied without clustering, but based on the characteristics of the TSP [29] and
the mechanism of the WOA, the maximum problem size is limited to 1323 cities, because the
unclustered method does not perform as well as the clustered method and we wanted to use the same
datasets for both. The figures in subsection 4.2 illustrate this. Section 2 provides detailed
information about the TSP, K-means, the WOA, and its mathematical models. Section 3 explains the
proposed method, section 4 presents the results and discussion, and section 5 gives the
conclusion.


2. THE COMPREHENSIVE THEORETICAL BASIS 

The following subsections, numbered 2.1 to 2.4, explain the theory of the TSP, K-means, the WOA,
and its mathematical models, respectively. There are three mathematical models, which are
summarized with a pseudocode: encircling prey, the bubble-net attacking method, and the
exploration phase (search for the prey).


2.1. The travelling salesman problem (TSP) 

In theory, the problem is defined on a graph $G = (V, A)$ with $n$ vertices
$V = \{v_1, v_2, \ldots, v_n\}$ and a distance matrix $C = (c_{ij})$. The set of arcs $A$ [30] is
described in (1). In the TSP, the objective is the tour with minimum distance. Solving it
exhaustively needs on the order of $n!$ comparisons [31], which makes an exact solution
impractical for large $n$. We want to improve the time and the complexity of this problem by
using the WOA proposed by Mirjalili and a data mining technique called K-means. Since the TSP
belongs to the family of vehicle routing problems (VRP) and has a minimum-cost objective function
[32], we apply heuristic algorithms to find the best solutions.


$$A = \{(v_i, v_j) \mid v_i, v_j \in V,\ i \neq j\} \tag{1}$$
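The objective described above, the length of a closed tour, can be sketched in a few lines. This is a minimal illustration that assumes Euclidean distances between city coordinates; the function name `tour_length` and the four-city example are hypothetical and not taken from the implementation used in this work.

```python
import math

def tour_length(tour, coords):
    """Length of the closed tour that visits coords in the order given by tour."""
    total = 0.0
    for k in range(len(tour)):
        x1, y1 = coords[tour[k]]
        x2, y2 = coords[tour[(k + 1) % len(tour)]]  # wrap around to the start
        total += math.hypot(x2 - x1, y2 - y1)
    return total

# Four cities on the corners of a unit square (illustrative data).
coords = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
perimeter = tour_length([0, 1, 2, 3], coords)  # walks around the square
crossing = tour_length([0, 2, 1, 3], coords)   # uses both diagonals
```

Both tours visit every city once, but the second one crosses itself and is longer, which is the kind of difference the fitness function is meant to capture.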


2.2. K-means algorithm 

Clustering is a method that partitions data into groups, so-called clusters [33]-[38]. K-means
starts with K clusters and K centroids. This iterative approach begins with a random selection of
K objects as the initial centroids of the K clusters, which gives the first assignment. As the
algorithm below explains, each iteration assigns every object to the nearest cluster and then
recomputes the average of the objects in each cluster for the next assignment.


Algorithm: K-means
Inputs:
  K: the number of clusters (initial centroids),
  D: a set of n objects
Output: K clusters
Method:
  (1) arbitrarily choose K objects from D
  (2) repeat
  (3)   according to the average of the objects within each cluster, (re)assign the objects to
        the closest cluster
  (4)   update by calculating the average of the objects for each cluster under the new
        assignments
  (5) until no change happens in the clusters
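The steps above can be sketched with NumPy as follows. This is a minimal version for illustration only: the function name `kmeans`, the `max_iter` safeguard, the handling of empty clusters, and the synthetic two-blob data are assumptions, not details of the implementation used in this work.

```python
import numpy as np

def kmeans(points, k, rng, max_iter=100):
    """Plain K-means following steps (1) to (5); returns (labels, centroids)."""
    points = np.asarray(points, dtype=float)
    # Step (1): pick k distinct objects as the initial centroids.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    labels = np.full(len(points), -1)
    for _ in range(max_iter):
        # Step (3): assign every object to its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        # Step (5): stop once the assignment no longer changes.
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Step (4): move each centroid to the mean of its members.
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return labels, centroids

rng = np.random.default_rng(0)
# Two well-separated blobs of 20 points each (synthetic data, not from TSPLIB).
blob_a = rng.normal(loc=(0.0, 0.0), scale=0.1, size=(20, 2))
blob_b = rng.normal(loc=(10.0, 10.0), scale=0.1, size=(20, 2))
labels, centroids = kmeans(np.vstack([blob_a, blob_b]), k=2, rng=rng)
```

On data this clearly separated, each blob ends up in its own cluster regardless of which objects are drawn as the initial centroids.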


2.3. The whale optimization algorithm 

This algorithm is a swarm intelligence algorithm inspired by the hunting behavior of whales [39].
The hunters look for a group of prey, encircle it, and gradually tighten the ring until they
catch the prey [40]. Whales have a special hunting method called the bubble-net feeding method,
in which they create distinctive bubbles along a circle [41].
2.4. Mathematical models

The equations are explained on the basis of three mathematical models: encircling prey, the
bubble-net attacking method (spiral bubble-net feeding maneuver), and search for the prey. The
equations are referenced in the pseudocode of the WOA. In each model, D and X take new values
based on the new distances and positions.


2.4.1. Encircling prey 

The whales do not know the best position at the first step, so the current best candidate
solution is treated as the target prey. In the following iterations, once the best search agent
has been found, the other search agents try to move their positions toward it. The best search
agent is the one closest to the prey. Following Mirjalili et al., this behavior is described by
(2) to (5):


$$D = |C \cdot X^{*}(t) - X(t)| \tag{2}$$
$$X(t+1) = X^{*}(t) - A \cdot D \tag{3}$$
$$A = 2a \cdot r - a \tag{4}$$
$$C = 2r \tag{5}$$


D is the distance given by (2), and A and C are coefficients with $A \in [-a, a]$, where a
decreases from 2 to 0 over the iterations. X represents the position of a whale, which we use as
the agent, r is a number selected randomly in [0, 1], X(t+1) is the next position of the whale,
which can be a better answer, and t indicates the iteration. X*(t) is the best solution found
among the whales so far, and it may change later.
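As a rough illustration of (2) to (5), the following sketch performs one encircling move on a continuous position vector. The function name `encircle` is hypothetical, and drawing r separately for A and for C is an assumption, since the text uses a single symbol r for both.

```python
import numpy as np

def encircle(x, x_best, a, rng):
    """One encircling-prey move of a whale at x toward x_best, following (2) to (5)."""
    A = 2.0 * a * rng.random() - a   # (4): A lies in [-a, a]
    C = 2.0 * rng.random()           # (5): second random draw (an assumption)
    D = np.abs(C * x_best - x)       # (2)
    return x_best - A * D            # (3)

rng = np.random.default_rng(1)
x_best = np.array([1.0, 2.0])
x = np.array([4.0, -3.0])
x_early = encircle(x, x_best, a=2.0, rng=rng)  # early iteration: large steps are possible
x_final = encircle(x, x_best, a=0.0, rng=rng)  # a = 0 gives A = 0, so the whale lands on x_best
```

The second call shows why decreasing a tightens the ring: once a reaches 0, the step A·D vanishes and the new position coincides with the best agent.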


2.4.2. Bubble-net attacking method 

Two mechanisms are used here: i) the shrinking encircling mechanism and ii) the spiral updating
position, both of which exploit the region around the best solution. In other words, the WOA
adopts these two approaches to update the positions of the whales. The first is realized by
decreasing a in (4). For the second approach, the spiral updating position, we have the following
equations, where D' is the distance between the whale and the best solution, b is a constant that
defines the shape of the logarithmic spiral, and l is a random value in the interval [-1, 1].


$$X(t+1) = D' \cdot e^{bl} \cdot \cos(2\pi l) + X^{*}(t) \tag{6}$$
$$D' = |X^{*}(t) - X(t)| \tag{7}$$


The selection between these two approaches is based on p, a random number in [0, 1], and
determines how the position is updated. If p < 0.5, we apply (3) to obtain X(t+1); otherwise, if
p ≥ 0.5, we apply (6), as summarized in (8):


$$X(t+1) = \begin{cases} X^{*}(t) - A \cdot D & \text{if } p < 0.5 \\ D' \cdot e^{bl} \cdot \cos(2\pi l) + X^{*}(t) & \text{if } p \geq 0.5 \end{cases} \tag{8}$$
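The spiral update and the choice between the two mechanisms can be sketched as below. The names `spiral` and `bubble_net` are hypothetical, and b = 1 is an assumed value, since the text only states that b is a constant.

```python
import numpy as np

def spiral(x, x_best, b, l):
    """Spiral position update, following (6) and (7)."""
    d_prime = np.abs(x_best - x)                                          # (7)
    return d_prime * np.exp(b * l) * np.cos(2.0 * np.pi * l) + x_best     # (6)

def bubble_net(x, x_best, a, b, rng):
    """Choose between shrinking encircling and the spiral, following (8)."""
    p = rng.random()
    if p < 0.5:
        A = 2.0 * a * rng.random() - a
        C = 2.0 * rng.random()
        return x_best - A * np.abs(C * x_best - x)   # equation (3)
    l = rng.uniform(-1.0, 1.0)
    return spiral(x, x_best, b, l)

x_best = np.array([0.0, 0.0])
x = np.array([3.0, 4.0])
at_zero = spiral(x, x_best, b=1.0, l=0.0)       # e^0 * cos(0) = 1, so the result is x_best + D'
at_quarter = spiral(x, x_best, b=1.0, l=0.25)   # cos(pi/2) = 0, so the whale lands on x_best
moved = bubble_net(x, x_best, a=1.0, b=1.0, rng=np.random.default_rng(3))
```

The two fixed values of l show the range of the spiral: depending on l, the whale can stay a full distance D' away from the best agent or arrive exactly at it.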


2.4.3. Exploration phase 
If |A| ≥ 1, the whales select a random whale. This step is called exploration, and the whales
update their distances and positions by (9) and (10). Based on the new X_rand, which is the
position of a random whale, the values of D and X(t+1) change, as addressed in step 12 of the
pseudocode. The parameter a decreases from 2 to 0, which gives the algorithm the ability to
search for the prey. This search is global.


$$D = |C \cdot X_{rand} - X| \tag{9}$$
$$X(t+1) = X_{rand} - A \cdot D \tag{10}$$
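To see how a controls the amount of exploration, note that A = 2a·r - a is uniform in [-a, a] when r is uniform in [0, 1], so for a scalar A the probability of |A| ≥ 1 is 1 - 1/a for a > 1 and 0 otherwise. The sketch below checks this numerically and shows the exploration move of (9) and (10). Treating A as a scalar and the function names are assumptions made for illustration.

```python
import numpy as np

def exploration_probability(a):
    """P(|A| >= 1) for a scalar A = 2*a*r - a with r uniform in [0, 1]."""
    return 1.0 - 1.0 / a if a > 1.0 else 0.0

def explore(x, x_rand, a, rng):
    """Exploration move relative to a randomly chosen whale, following (9) and (10)."""
    A = 2.0 * a * rng.random() - a
    C = 2.0 * rng.random()
    D = np.abs(C * x_rand - x)   # (9)
    return x_rand - A * D        # (10)

rng = np.random.default_rng(2)
samples = 2.0 * 2.0 * rng.random(200_000) - 2.0      # values of A for a = 2
empirical = float(np.mean(np.abs(samples) >= 1.0))   # estimate of P(|A| >= 1) at a = 2
new_x = explore(np.array([1.0, 1.0]), np.array([5.0, -2.0]), a=2.0, rng=rng)
```

Under this reading, exploration can be chosen in about half of the p < 0.5 updates at the start (a = 2) and is no longer possible once a drops to 1 or below, which is how the decreasing a shifts the search from global to local.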


2.4.4. The pseudocode of the whale optimization algorithm (WOA) 
The pseudocode below summarizes the above-mentioned mathematical models of the WOA. After the
population size has been defined, the fitness of each whale is calculated, since each whale is an
agent. In this problem, the fitness is minimized. Until the termination criterion is met, the
values are updated according to the equations and conditions described earlier, and the result is
the position of the best whale, that is, the one with minimum fitness.

1: Initialize the population X_i (i = 1, 2, ..., n)
2: Calculate the fitness of each search agent
3: X* = the best search agent
4: while (t < maximum number of iterations)
5:   for each search agent
6:     Update a, A, C, l, and p
7:     if1 (p < 0.5)
8:       if2 (|A| < 1)
9:         Update the position of the current search agent by (3)
10:      else if2 (|A| ≥ 1)
11:        Select a random search agent (X_rand)
12:        Update the position of the current search agent by (10)
13:    else if1 (p ≥ 0.5)
14:      Update the position of the current search agent by (6)
15:  Check if any search agent is beyond the search space and amend it
16:  Calculate the fitness of each search agent
17:  Update X*
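The pseudocode can be turned into a short runnable sketch for a continuous test function. Several details here are assumptions rather than the implementation used in this work: a decreases linearly, A and C are scalars drawn per whale, b = 1, out-of-range positions are clipped to the bounds, and the sphere function stands in for a fitness function. How positions are mapped to TSP tours is not covered by this sketch.

```python
import numpy as np

def woa_minimize(fitness, dim, bounds, n_whales=30, max_iter=200, b=1.0, seed=0):
    """Continuous WOA sketch following the pseudocode; returns (best_x, best_f)."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    X = rng.uniform(lo, hi, size=(n_whales, dim))            # line 1
    f = np.array([fitness(x) for x in X])                    # line 2
    best_x, best_f = X[f.argmin()].copy(), float(f.min())    # line 3
    for t in range(max_iter):                                # line 4
        a = 2.0 - 2.0 * t / max_iter                         # assumed linear decrease from 2
        for i in range(n_whales):                            # line 5
            A = 2.0 * a * rng.random() - a                   # line 6
            C = 2.0 * rng.random()
            l = rng.uniform(-1.0, 1.0)
            p = rng.random()
            if p < 0.5:
                if abs(A) < 1.0:                             # lines 8-9, equation (3)
                    X[i] = best_x - A * np.abs(C * best_x - X[i])
                else:                                        # lines 10-12, equation (10)
                    x_rand = X[rng.integers(n_whales)].copy()
                    X[i] = x_rand - A * np.abs(C * x_rand - X[i])
            else:                                            # lines 13-14, equation (6)
                d_prime = np.abs(best_x - X[i])
                X[i] = d_prime * np.exp(b * l) * np.cos(2.0 * np.pi * l) + best_x
        X = np.clip(X, lo, hi)                               # line 15
        f = np.array([fitness(x) for x in X])                # line 16
        if f.min() < best_f:                                 # line 17
            best_x, best_f = X[f.argmin()].copy(), float(f.min())
    return best_x, best_f

def sphere(x):
    """Simple convex test function with its minimum of 0 at the origin."""
    return float(np.sum(x ** 2))

best_x, best_f = woa_minimize(sphere, dim=5, bounds=(-10.0, 10.0))
```

Because the best agent is only replaced when a better one is found, the returned fitness can never be worse than that of the best whale in the initial population.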


3. THE PROPOSED METHOD AND ALGORITHM 

In this work, we propose a method that combines K-means with the WOA to solve the TSP. This new
TSP solver determines the number of clusters from (11), where N is the number of cities [37].
First, we apply K-means to divide the data into K clusters. Then we apply the WOA to find a
minimum tour in each cluster. In the final step, we connect the clusters to find the best
solution, that is, the shortest path found.
$$K = \sqrt{\frac{N}{2}} \tag{11}$$


We apply the following procedure from [28]:

i) We calculate the distances between the centroids of the clusters to find the pair of clusters
with the minimum distance (using (12) for the distance).

ii) We combine the selected clusters into one bigger cluster [37]. Then we connect the tours of
these two clusters.

iii) We repeat step i) until the minimum tour is generated.


$$d_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2} \tag{12}$$
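Step i) of the procedure can be sketched as a search over all centroid pairs using the Euclidean distance of (12). The function name `closest_cluster_pair` and the example centroids are hypothetical.

```python
import numpy as np

def closest_cluster_pair(centroids):
    """Return (i, j, d_ij) for the two centroids with the smallest distance (12)."""
    c = np.asarray(centroids, dtype=float)
    diff = c[:, None, :] - c[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=2))   # d[i, j] as in (12)
    np.fill_diagonal(d, np.inf)            # ignore the zero distance of a cluster to itself
    i, j = np.unravel_index(np.argmin(d), d.shape)
    return int(i), int(j), float(d[i, j])

# Four illustrative centroids; clusters 0 and 2 are the closest pair.
centroids = [(0.0, 0.0), (10.0, 0.0), (3.0, 4.0), (10.0, 9.0)]
i, j, d_ij = closest_cluster_pair(centroids)
```

The pair returned here would be the one merged in step ii), after which the centroid of the merged cluster has to be recomputed before the search is repeated.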


In this method, each whale is an agent and can be a solution to the TSP. The algorithm updates
the agents to find the current best solutions. In this algorithm, we consider 4 candidate nodes
in step 6. These nodes lie around the centroid of the cluster, and only the closest one is joined
to another cluster, so that we always select the nearest city. The steps of our algorithm are:

- Step 1: Initialize the population size as shown in Table 1

- Step 2: Specify K based on (11)

- Step 3: Apply the K-means algorithm

- Step 4: Apply the whale optimization algorithm for i = 1: K

- Step 5: Find the positions of the cities

- Step 6: Sort by indexing

- Step 7: Find the nodes (cities) in each cluster that are closest to the centroid of that cluster

- Step 8: Join the closest cities to another cluster

- Step 9: Stop when no cluster remains unjoined
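One possible reading of the joining step is sketched below: two cluster tours are connected at their closest pair of cities. This is an assumption made for illustration, since the steps above do not fix how the junction is chosen beyond selecting the nearest city; the function name `join_tours` and the example coordinates are hypothetical.

```python
import numpy as np

def join_tours(tour_a, tour_b, coords):
    """Concatenate two cluster tours so that their closest pair of cities is adjacent."""
    a = np.asarray(tour_a)
    b = np.asarray(tour_b)
    # Pairwise Euclidean distances between the cities of the two tours.
    diff = coords[a][:, None, :] - coords[b][None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    ia, ib = np.unravel_index(np.argmin(dist), dist.shape)
    a_rot = np.roll(a, -(ia + 1))  # rotate tour A so its closest city comes last
    b_rot = np.roll(b, -ib)        # rotate tour B so its closest city comes first
    return np.concatenate([a_rot, b_rot]).tolist()

coords = np.array([
    (0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0),   # cities of cluster A
    (4.5, 0.0), (6.0, 0.0), (6.0, 1.0), (5.0, 1.0),   # cities of cluster B
])
joined = join_tours([0, 1, 2, 3], [4, 5, 6, 7], coords)
```

Rotating a closed tour does not change its length, so only the two connecting edges (the junction and the implicit edge back to the first city) are added by the merge.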
4. RESULTS AND DISCUSSIONS

The method is implemented in MATLAB (matrix laboratory) on a machine with an Intel Core i5
processor and 6 GB of RAM. The figures and tables in this section show the behavior of our
algorithm before and after applying K-means. Our datasets are Eil51, Linhp318, and Rl1323. All
data are available in the standard TSP library [42]. The vector a is assigned a maximum value of
2 for all of the iterations.


4.1. Table results before and after applying the clustering method

In these tables, the average of the best tour (fitness) is used as the benchmark. Table 1 shows
the parameter settings. The initial population and the number of iterations are the same for both
algorithms and were chosen from experimental results obtained while running the program, but they
can take different values as well. Table 2 presents the results of the fitness function before
applying the K-means clustering method. For each of the three datasets, we calculated the
minimum, maximum, average, and standard deviation of the fitness function. The fitness averages
for this first approach are 1176.42, 393928.2, and 8262213. Table 3 gives the fitness function of
the clustered method, which uses the K-means approach. Comparing the two tables shows that the
fitness averages improve to 489.8785, 82908.2, and 1.19E+06 with the second approach on the same
datasets. The statistics in these two tables show how the fitness function improves when our
algorithm is combined with K-means, especially for the third dataset. Table 4 shows the execution
time of the unclustered approach, that is, solving the TSP using only the whales and without
clustering. The average times are 3.765785 s, 16.63365 s, and 63.50511 s, which are high,
especially for the third dataset with 1323 nodes (cities). As Table 4 shows, even the best
(minimum) run on the third dataset takes 51.669 s. Table 5 shows the execution time of the
clustered approach. According to this table, the average times drop from 3.765785 s, 16.63365 s,
and 63.50511 s to 1.2205855 s, 5.707375 s, and 24.81385 s with the clustered method. The minimum
run on the third dataset takes 20.345 s. This confirms that the execution time of our algorithm
improves by more than 50%.


Table 1. The parameter settings

Initial population   Iterations   Number of cities
100                  20           51
100                  20           318
100                  20           1323


Table 2. The fitness value for the unclustered whale optimization algorithm

Dataset    Min           Max        Average    Std. dev. (sample)
Eil51      1100.269      1262.614   1176.421   48.53605
Linhp318   376614.9      409489.4   393928.2   8468.606
Rl1323     8174234.322   8740947    8262213    49003.98


Table 3. The fitness value for the clustered whale optimization algorithm

Dataset    Min        Max        Average    Std. dev. (sample)
Eil51      454.9      526.76     489.8785   21.2760405
Linhp318   77273      88040      82908.2    3180.155
Rl1323     1.14E+06   1.23E+06   1.19E+06   2.78E+04


Table 4. The execution time (s) for the unclustered whale optimization algorithm

Dataset    Min      Max      Average    Std. dev. (sample)
Eil51      3.1421   4.7562   3.765785   0.396958
Linhp318   13.642   20.79    16.63365   1.77977
Rl1323     51.669   79.21    63.50511   6.286887


Table 5. The execution time (s) for the clustered whale optimization algorithm

Dataset    Min      Max      Average     Std. dev. (sample)
Eil51      1.0021   1.4997   1.2205855   0.1708681
Linhp318   4.6754   7.953    5.707375    0.979334
Rl1323     20.345   55.258   24.81385    7.647444
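The relative improvements implied by the tables can be checked with a few lines of arithmetic. The averages below are copied from Tables 2 to 5; the function name `reduction_percent` is hypothetical.

```python
# Average values copied from Tables 2 to 5 as (unclustered, clustered) pairs.
fitness_avg = {
    "Eil51": (1176.421, 489.8785),
    "Linhp318": (393928.2, 82908.2),
    "Rl1323": (8262213.0, 1.19e6),
}
time_avg = {
    "Eil51": (3.765785, 1.2205855),
    "Linhp318": (16.63365, 5.707375),
    "Rl1323": (63.50511, 24.81385),
}

def reduction_percent(before, after):
    """Relative reduction of `after` with respect to `before`, in percent."""
    return 100.0 * (before - after) / before

fitness_gain = {name: reduction_percent(*v) for name, v in fitness_avg.items()}
time_gain = {name: reduction_percent(*v) for name, v in time_avg.items()}
for name in fitness_avg:
    print(f"{name}: fitness -{fitness_gain[name]:.1f}%, time -{time_gain[name]:.1f}%")
```

With these averages, every reduction in both fitness and execution time exceeds 50%, with the largest fitness reduction on the third dataset.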
4.2. Figures

In this section, the figures show how the proposed method is used to find a more efficient route
in which every node is visited once and the salesman returns to the starting point. The figures
become more complex as the number of nodes grows, so illustrating all the connections becomes
harder as well. Therefore Figure 1, for the Eil51 dataset, gives the clearest indication of the
calculations that have been done. Figure 1(a) shows a graph of the salesman routes produced by
the unclustered approach. Figure 1(b) shows the salesman routing with the help of clustering, so
this figure contains connections over the smallest distances, which are more optimized. In
Figure 2 the TSP is solved for Linhp318 with (a) the unclustered method and (b) the clustered
approach. Figure 2(a) is more complicated than Figure 2(b). The dataset has 318 nodes, but in
Figure 2(b) the problem is solved with less complexity because of the K-means clustering
approach, which helps to find better connections.




Figure 1. Comparison between different approaches for the Eil51 dataset (a) unclustered approach and 
(b) clustered approach 
Figure 2. Comparison between different approaches for the Linhp318 dataset (a) unclustered approach and
(b) clustered approach
Figure 3 compares the algorithm performance for the Rl1323 dataset with (a) the unclustered
method and (b) the clustered approach. Figure 3(a) shows more complexity than the previous
figures because the dataset has 1323 nodes, which means the salesman must travel between all
these nodes and return to the first node. Solving a problem of this size exhaustively is
practically impossible, but with the help of the whales it is solved, with high complexity, in
Figure 3(a). Figure 3(b) solves the same problem with less complexity because of the clustering
approach.
Figure 3. Comparison between different approaches for the Rl1323 dataset (a) unclustered approach and
(b) clustered approach


5. CONCLUSION AND FUTURE WORKS 

This article introduced two approaches to solving the TSP: an unclustered method and a combined
approach based on the WOA and K-means. The combined approach divides the problem into clusters
and applies the WOA to these smaller clusters. In the end, the combined algorithm joins the
nearest clusters. The results give the best solution with respect to the iterative best cost.
According to the tables and figures presented, this new clustered heuristic outperforms the first
approach on the same problems. Future research will address larger-scale optimization problems.
Furthermore, this article is a preliminary study for future research on clustering, since the WOA
can be hybridized with other clustering algorithms to find more efficient solutions than the
proposed method.